AITopics | noise model

We introduce a simple yet powerful framework for training large language models. In contrast to the standard autoregressive next-token prediction based on an exact prefix, we propose a perturbation-based procedure that first transforms the prefix into a semantic neighbor and then conditions on this perturbed variant for next-token prediction. This yields a hierarchical model with a pre-post-additive noise structure. Within this framework, we develop a rigorous theory of extrapolability, namely, the capacity of a model class to make reliable predictions for token sequences that lie outside the empirical support of the training corpus. We evaluate the finite-sample performance of the proposed procedure using both synthetic and real-world language data. Results show that the proposed method consistently improves out-of-support prediction while maintaining competitive in-support performance, demonstrating that perturbation offers a practical route to language modeling.

large language model, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

2605.04344

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.87)

Industry:

Government > Voting & Elections (0.67)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Unsupervised Image Denoising with Score Function

Neural Information Processing SystemsMay-1-2026, 04:56:12 GMT

Though achieving excellent performance in some cases, current unsupervised learning methods for single image denoising usually have constraints in applications. In this paper, we propose a new approach which is more general and applicable to complicated noise models. Utilizing the property of score function, the gradient of logarithmic probability, we define a solving system for denoising. Once the score function of noisy images has been estimated, the denoised result can be obtained through the solving system. Our approach can be applied to multiple noise models, such as the mixture of multiplicative and additive noise combined with structured correlation. Experimental results show that our method is comparable when the noise model is simple, and has good performance in complicated cases where other methods are not applicable or perform poorly.

artificial intelligence, machine learning, noise model, (15 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Appendix - Scalable Bayesian GPFA with automatic relevance determination and discrete noise models AFurther analyses of preparatory dynamics in the primate reaching task max sim

Neural Information Processing SystemsApr-26-2026, 00:56:00 GMT

Here we briefly consider why introducing a prior over the factor matrix enables automatic relevance determination. These ideas reflect results by Bishop [1] and our experiments in Section 3.1. For simplicity, we will first consider the case of factor analysis where p(X) = Q d,tN(xdt; 0,1).

artificial intelligence, dimension, machine learning, (18 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

58238e9ae2dd305d79c2ebc8c1883422-Paper.pdf

Neural Information Processing SystemsApr-26-2026, 00:44:44 GMT

artificial intelligence, bgpfa, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Cognitive Science (0.69)
(2 more...)

Add feedback

Noise2Score: Tweedie's Approach to Self-Supervised Image Denoising without Clean Images

Neural Information Processing SystemsApr-24-2026, 12:49:32 GMT

Recently, there has been extensive research interest in training deep networks to denoise images without clean reference. However, the representative approaches such as Noise2Noise, Noise2Void, Stein's unbiased risk estimator (SURE), etc. seem to differ from one another and it is difficult to find the coherent mathematical structure. To address this, here we present a novel approach, called Noise2Score, which reveals a missing link in order to unite these seemingly different approaches. Specifically, we show that image denoising problems without clean images can be addressed by finding the mode of the posterior distribution and that the Tweedie's formula offers an explicit solution through the score function (i.e. the gradient of loglikelihood). Our method then uses the recent finding that the score function can be stably estimated from the noisy images using the amortized residual denoising autoencoder, the method of which is closely related to Noise2Noise or Nose2Void. Our Noise2Score approach is so universal that the same network training can be used to remove noises from images that are corrupted by any exponential family distributions and noise parameters. Using extensive experiments with Gaussian, Poisson, and Gamma noises, we show that Noise2Score significantly outperforms the state-of-the-art self-supervised denoising methods in the benchmark data set such as (C)BSD68, Set12, and Kodak, etc.

Add feedback

Double Gumbel Q-Learning

Neural Information Processing SystemsApr-24-2026, 11:10:02 GMT

We show that Deep Neural Networks introduce two heteroscedastic Gumbel noise sources into Q-Learning. To account for these noise sources, we propose Double Gumbel Q-Learning, a Deep Q-Learning algorithm applicable for both discrete and continuous control. In discrete control, we derive a closed-form expression for the loss function of our algorithm. In continuous control, this loss function is intractable and we therefore derive an approximation with a hyperparameter whose value regulates pessimism in Q-Learning. We present a default value for our pessimism hyperparameter that enables DoubleGum to outperform DDPG, TD3, SAC, XQL, quantile regression, and Mixture-of-Gaussian Critics in aggregate over 33 tasks from DeepMind Control, MuJoCo, MetaWorld, and Box2D and show that tuning this hyperparameter may further improve sample efficiency.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre: Research Report > New Finding (0.45)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Efficient Neural Codes under Metabolic Constraints

Zhuo Wang, Xue-Xin Wei, Alan A. Stocker, Daniel D. Lee

Neural Information Processing SystemsApr-22-2026, 11:26:12 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, constraint, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.15)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.96)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Machine Learning (0.50)

Add feedback

Noise-Tolerant Life-Long Matrix Completion via Adaptive Sampling

Maria-Florina F. Balcan, Hongyang Zhang

Neural Information Processing SystemsApr-22-2026, 09:42:31 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, noise, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Normalized Spectral Map Synchronization

Yanyao Shen, Qixing Huang, Nati Srebro, Sujay Sanghavi

Neural Information Processing SystemsApr-22-2026, 00:03:21 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, optimization problem, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.29)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.89)

Add feedback

Last-Iterate Guarantees for Learning in Co-coercive Games

Chandak, Siddharth, Tamizholi, Ramanan, Bambos, Nicholas

arXiv.org Machine LearningApr-22-2026

We establish finite-time last-iterate guarantees for vanilla stochastic gradient descent in co-coercive games under noisy feedback. This is a broad class of games that is more general than strongly monotone games, allows for multiple Nash equilibria, and includes examples such as quadratic games with negative semidefinite interaction matrices and potential games with smooth concave potentials. Prior work in this setting has relied on relative noise models, where the noise vanishes as iterates approach equilibrium, an assumption that is often unrealistic in practice. We work instead under a substantially more general noise model in which the second moment of the noise is allowed to scale affinely with the squared norm of the iterates, an assumption natural in learning with unbounded action spaces. Under this model, we prove a last-iterate bound of order $O(\log(t)/t^{1/3})$, the first such bound for co-coercive games under non-vanishing noise. We additionally establish almost sure convergence of the iterates to the set of Nash equilibria and derive time-average convergence guarantees.

artificial intelligence, co-coercive game, machine learning, (16 more...)

arXiv.org Machine Learning

2604.19065

Country: